AdaBoost is Consistent
Authors
Abstract
The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that, provided AdaBoost is stopped after n^ν iterations, for sample size n and ν < 1, the sequence of risks of the classifiers it produces approaches the Bayes risk if the Bayes risk L∗ > 0.
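To make the stopping rule concrete, here is a minimal sketch of AdaBoost run for ⌊n^ν⌋ rounds on a toy 1-D dataset. The decision-stump weak learner, function names, and the dataset are illustrative assumptions, not the paper's construction; the point is only the iteration count T = ⌊n^ν⌋ with ν < 1.

```python
import numpy as np

def stump_predict(x, thresh, sign):
    # A decision stump: predicts sign for x > thresh, -sign otherwise.
    return sign * np.where(x > thresh, 1.0, -1.0)

def adaboost(x, y, nu=0.5):
    """Run AdaBoost for T = max(1, floor(n**nu)) rounds, nu < 1, and
    return the weighted-vote classifier (labels in {-1, +1})."""
    n = len(x)
    T = max(1, int(n ** nu))           # the stopping rule: n**nu iterations
    w = np.full(n, 1.0 / n)            # example weights, initially uniform
    stumps = []                        # list of (threshold, sign, alpha)
    thresholds = np.unique(x)
    for _ in range(T):
        # Exhaustively pick the stump with the smallest weighted error.
        best = None
        for t in thresholds:
            for s in (1.0, -1.0):
                err = float(np.sum(w[stump_predict(x, t, s) != y]))
                if best is None or err < best[0]:
                    best = (err, t, s)
        err, t, s = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # clip to avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)
        # Reweight: increase weight on misclassified examples.
        w *= np.exp(-alpha * y * stump_predict(x, t, s))
        w /= w.sum()
        stumps.append((t, s, alpha))
    def classify(xq):
        score = sum(a * stump_predict(xq, t, s) for t, s, a in stumps)
        return np.where(score >= 0, 1.0, -1.0)
    return classify
```

On a separable threshold dataset this recovers the correct labels after the prescribed ⌊n^ν⌋ rounds; the consistency result concerns how the risk of such voted classifiers behaves as n grows.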
Similar resources
CS 260: Machine Learning Theory, Lecture 14: Generalization Error of AdaBoost
We saw last time that the training error of AdaBoost decreases exponentially as the number of rounds T grows. However, this says nothing about how well the function output by AdaBoost performs on new examples. Today we will discuss the generalization error of AdaBoost. We know that AdaBoost gives us a consistent function quickly; the bound we derived on training error decreases exponentially, a...
Boosting versus Covering
We investigate improvements of AdaBoost that can exploit the fact that the weak hypotheses are one-sided, i.e., all of a hypothesis's positive predictions (or all of its negative predictions) are correct. In particular, for any set of m labeled examples consistent with a disjunction of k literals (which are one-sided in this case), AdaBoost constructs a consistent hypothesis by using O(k log m) iterations. On the other hand, ...
Technical Reports on Mathematical and Computing Sciences: TR-C194
We investigate further improvement of boosting in the case that the target concept belongs to the class of r-of-k threshold Boolean functions, which answer “+1” if at least r of k relevant variables are positive, and answer “−1” otherwise. Given m examples of an r-of-k function and literals as base hypotheses, popular boosting algorithms (e.g., AdaBoost [FS97]) construct a consistent final hyp...
AdaBoost Ensemble Algorithms for Breast Cancer Classification
With advances in technology, many different tumor features have been collected for Breast Cancer (BC) diagnosis; processing such large data sets poses challenges, including high storage requirements and the time needed for access and processing. The objective of this paper is to classify BC based on the extracted tumor features. To extract useful information and diagnose the tumo...
Learning r-of-k Functions by Boosting
We investigate further improvement of boosting in the case that the target concept belongs to the class of r-of-k threshold Boolean functions, which answer “+1” if at least r of k relevant variables are positive, and answer “−1” otherwise. Given m examples of an r-of-k function and literals as base hypotheses, popular boosting algorithms (e.g., AdaBoost) construct a consistent final hypothesis b...
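As a concrete reading of the class definition above, here is a minimal sketch of an r-of-k threshold Boolean function; the function name, the ±1 input encoding, and the example indices are hypothetical illustrations, not notation from the papers.

```python
def r_of_k(x, relevant, r):
    """Return +1 if at least r of the listed relevant variables of x are
    positive (+1), and -1 otherwise."""
    return 1 if sum(1 for i in relevant if x[i] == 1) >= r else -1

# A 2-of-3 function over the first three variables of a +/-1 vector:
f = lambda x: r_of_k(x, relevant=[0, 1, 2], r=2)
```

Note the two extremes of the class: r = 1 gives a disjunction of the k relevant variables, and r = k gives their conjunction, which is why results for disjunctions (as in the covering comparison above) are the special case r = 1.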